Cooperative Reinforcement Learning Using an Expert-Measuring Weighted Strategy with WoLF
Author
Abstract
Gradient descent learning algorithms have proven effective in solving mixed strategy games. The policy hill climbing (PHC) variants of WoLF (Win or Learn Fast) and PDWoLF (Policy Dynamics based WoLF) have both shown rapid convergence to equilibrium solutions by increasing the accuracy of their gradient parameters over standard Q-learning. Likewise, cooperative learning techniques using weighted strategy sharing (WSS) and expertness measurements improve agent performance when multiple agents are working toward a common goal. By combining these cooperative techniques with fast gradient descent learning, an agent’s performance converges to a solution at an even faster rate. This statement is verified in a stochastic grid world environment using a limited visibility hunter-prey model with random and intelligent prey. Among five different expertness measurements, cooperative learning using each PHC algorithm converges faster than independent learning when agents strictly learn from better performing agents.
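The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the commonly described form of the WoLF-PHC policy step (a small step size when "winning", a larger one when "losing") and a "learn only from more expert agents" weighting for WSS. The function names, the `impressibility` parameter, and the numeric values are hypothetical and chosen only for illustration.

```python
def wolf_phc_policy_step(pi, avg_pi, q, delta_win, delta_lose):
    """One WoLF-PHC-style policy update for a single state (illustrative).

    pi, avg_pi and q are equal-length lists indexed by action (at least two).
    """
    n = len(q)
    # "Winning": the current policy has a higher expected value than the average policy.
    current_value = sum(p * v for p, v in zip(pi, q))
    average_value = sum(p * v for p, v in zip(avg_pi, q))
    delta = delta_win if current_value > average_value else delta_lose
    best = max(range(n), key=lambda a: q[a])
    new_pi = list(pi)
    # Shift probability mass from every other action toward the greedy action,
    # never taking more than an action currently holds.
    for a in range(n):
        if a != best:
            step = min(new_pi[a], delta / (n - 1))
            new_pi[a] -= step
            new_pi[best] += step
    return new_pi


def learn_from_experts_weights(expertness, i, impressibility):
    """Weights agent i gives to each agent's Q-values (assumed WSS variant).

    Only agents with higher expertness than agent i receive non-zero weight.
    """
    gaps = [max(e - expertness[i], 0.0) for e in expertness]
    total = sum(gaps)
    if total == 0.0:
        # No agent is more expert: agent i keeps its own knowledge unchanged.
        return [1.0 if j == i else 0.0 for j in range(len(expertness))]
    weights = [impressibility * g / total for g in gaps]
    weights[i] = 1.0 - impressibility
    return weights


def share_q_values(q_rows, weights):
    """Weighted average of the agents' Q-values for one state."""
    n_actions = len(q_rows[0])
    return [sum(w * row[a] for w, row in zip(weights, q_rows)) for a in range(n_actions)]


# Hypothetical usage: an agent that is not "winning" takes the larger step.
policy = wolf_phc_policy_step([0.5, 0.5], [0.5, 0.5], [1.0, 0.0], 0.1, 0.4)
weights = learn_from_experts_weights([1.0, 3.0, 5.0], 0, 0.5)
shared = share_q_values([[0.0, 0.0], [6.0, 0.0], [3.0, 3.0]], weights)
```

In this sketch the least expert agent blends in the Q-values of the two better agents in proportion to how far their expertness exceeds its own, while the most expert agent would keep its own table unchanged.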
Similar resources
Fuzzy State Aggregation and Policy Hill Climbing for Stochastic Environments
Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually learn even as the operating environment changes. Additionally, applying reinforcement learning to multiple cooperative software agents (a multi-agent system) not only allows each individual ag...
Expertness based cooperative Q-learning
By using other agents' experiences and knowledge, a learning agent may learn faster, make fewer mistakes, and create some rules for unseen situations. These benefits would be gained if the learning agent can extract proper rules from the other agents' knowledge for its own requirements. One possible way to do this is to have the learner assign some expertness values (intelligence level values) ...
Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments
Although single-agent deep reinforcement learning has achieved significant success due to the experience replay mechanism, concerns should be reconsidered in multiagent environments. This work focuses on the stochastic cooperative environment. We apply a specific adaptation to one recently proposed weighted double estimator and propose a multiagent deep reinforcement learning framework, named Weig...
Expertness measuring in cooperative learning
Cooperative learning in a multi-agent system can improve the learning quality and learning speed. The improvement can be gained if each agent detects the expert agents and uses their knowledge properly. In this paper, a new cooperative learning method, called Weighted Strategy Sharing (WSS), is introduced. Some criteria are also introduced to measure the expertness of agents. In WSS, based on the...
Strategic Concept Formation of Consumer Goods Based on Knowledge Acquisition from Questionnaire Data
Product concept formation, which occurs in the early stage of product development, is critical to the successful development of a new product or to the suitable improvement of a current product. We propose a novel method for computer aided strategic concept formation based on knowledge acquisition from questionnaire data. Product concept should be developed based on consumers’ needs that are u...